Steph Curry is one of the best NBA players of all time and is also considered the best three-point shooter in the league. This text deals with basketball, with a focus on the three-point shot.
This document contains the first homework assignment in the SBD2 module in the autumn semester 2024.
The claim under examination is that in basketball, the team that takes more three-point shots in a game than its opponent wins the game.
Let’s first take a look at all 25,738 games played over the past 20 seasons. From the dataset, I created a new variable that indicates whether the team with more attempted three-pointers won the game.
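A minimal sketch of how such a variable could be built in base R. The toy data frame and the column names `game_id`, `3PA` and `win` are assumptions for illustration (one row per team per game), not the actual dataset. Excluding games with equal attempts is also an assumption of this sketch, since the text does not say how ties were treated.

```r
# Toy stand-in for the real data: one row per team per game.
# Column names are assumed for this sketch.
games <- data.frame(
  game_id = c(1, 1, 2, 2, 3, 3),
  `3PA`   = c(30, 25, 20, 35, 28, 28),
  win     = c(1, 0, 1, 0, 0, 1),
  check.names = FALSE
)

# Highest number of three-point attempts within each game
max_3pa <- ave(games$`3PA`, games$game_id, FUN = max)

# Flag the rows that reach that maximum and count them per game
is_max <- games$`3PA` == max_3pa
n_max  <- ave(as.numeric(is_max), games$game_id, FUN = sum)

# A team "took more threes" only if it alone holds the maximum (ties dropped)
took_more <- is_max & n_max == 1

# One entry per non-tied game: did the team with more attempts win?
more_3pa_won <- games$win[took_more] == 1

# Share of games won by the team with more attempts, in percent
round(100 * mean(more_3pa_won), 1)
```

In the toy data, game 3 is a tie and is dropped, so the share is computed over the two remaining games.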
The bar chart below shows that wins and losses are roughly equally distributed, with slightly more defeats.
Based on this graph, the team that attempted more three-pointers won 48.8% of the games. In 51.2% of the games, the team that attempted fewer three-pointers won.
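The comparison above suggests that the number of attempted three-pointers has hardly any influence on victory or defeat. To test this statistically, I fitted a logistic regression.
In this logistic regression, the coefficient for 3PA is statistically significant because its p value (0.0284) is below 0.05. The estimate itself is very small (0.0020), and the residual deviance (71356) is barely lower than the null deviance (71361), so 3PA explains almost none of the variation in win.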
##
## Call:
## glm(formula = win ~ `3PA`, family = binomial, data = data)
##
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.0492105  0.0241178  -2.040   0.0413 *
## `3PA`        0.0020151  0.0009193   2.192   0.0284 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 71361 on 51475 degrees of freedom
## Residual deviance: 71356 on 51474 degrees of freedom
## AIC: 71360
##
## Number of Fisher Scoring iterations: 3
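To make the size of the 3PA coefficient more tangible, the estimates from the output above can be converted into odds ratios and predicted win probabilities. This is only an illustrative sketch: the values of 20 and 40 attempts are arbitrary examples I chose, not figures from the dataset.

```r
# Estimates copied from the glm summary above
b_0   <- -0.0492105   # intercept
b_3pa <-  0.0020151   # coefficient for `3PA`

# Multiplicative change in the odds of winning
odds_ratio_1  <- exp(b_3pa)        # per one additional attempt
odds_ratio_10 <- exp(10 * b_3pa)   # per ten additional attempts

# Predicted win probability for two example attempt counts
# (20 and 40 are arbitrary illustrative values)
p_20 <- plogis(b_0 + b_3pa * 20)
p_40 <- plogis(b_0 + b_3pa * 40)

print(c(odds_ratio_1 = odds_ratio_1, odds_ratio_10 = odds_ratio_10))
print(c(p_20 = p_20, p_40 = p_40))
```

One additional attempt raises the odds of winning by roughly 0.2%, and doubling the attempts from 20 to 40 moves the predicted win probability by only about one percentage point. The effect is statistically detectable but practically negligible.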
To add a further statistical check, I next compute the correlation coefficient between the two variables win and 3PA. I also include the variable 3P%, as I am interested in it as well.
The two variables win and 3PA are practically uncorrelated: the rounded correlation coefficient is 0.01.
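A correlation this small can still be statistically significant when the sample is very large, which is why the regression result and the correlation do not contradict each other. The following sketch checks this with the standard t test for a correlation coefficient. It assumes n = 51476 rows, which I derived from the 51475 null degrees of freedom in the regression output, and it uses the rounded value 0.01, so the result is approximate.

```r
r <- 0.01     # rounded correlation between win and 3PA
n <- 51476    # assumed sample size: 51475 null degrees of freedom + 1

# t statistic and two-sided p value for H0: correlation = 0
t_stat  <- r * sqrt(n - 2) / sqrt(1 - r^2)
p_value <- 2 * pt(-abs(t_stat), df = n - 2)

# Share of variance in win that a linear relation with 3PA accounts for
r_squared <- r^2

print(c(t_stat = t_stat, p_value = p_value, r_squared = r_squared))
```

The t statistic is about 2.27, close to the z value of 2.192 in the logistic regression, while the explained variance is only about 0.01%. Significance here reflects the sample size, not the strength of the relationship.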
As described in part one, I suspected that this has changed over the past few seasons as the three-point revolution took place.
This graph shows that the trend line (blue) has moved between 50.1% and 47.4% in recent years. The value has therefore changed little over a long period of time, which indicates that the three-point revolution has had no noticeable influence on the winning percentage.
As mentioned in the chapter above, a three-point shooting revolution took place in the 2010s. Although teams in the modern NBA (from 2015) score significantly more points from three-pointers, this has no noticeable influence on winning.
In this chart, I illustrate how teams score their points, using yearly averages. The graph again clearly shows that many more three-pointers have been taken from 2015 onwards.
The data set I found offers many possibilities to compare the past twenty seasons. The claim I described at the beginning is that a team with more three-point attempts in a game than its opponent also wins the game. I examined this claim with several statistical analyses.
At the beginning, I only compared the wins and losses of the teams that took more three-point shots in a game. This showed that the two totals are almost the same, which matches the results of Scipiotheyounger’s article. Compared to the article, however, I analysed not one but 20 seasons, which means that my analysis covers a longer period of time.
I also fitted a logistic regression with the two variables 3PA and win. I used logistic regression because the win variable is binary. The p value was below 0.05, which indicates that the coefficient is statistically significant. However, the results also indicate that the variable 3PA has only a very small influence on the variable win.
Afterwards I created a correlation matrix. For me personally, this result says the most, because the correlation between the variables win and 3PA is only 0.01. This means that the number of three-point shots attempted does not indicate whether a team wins.
I also examined how the winning percentage of teams with more three-point attempts has developed over the seasons. It has changed only slightly, with a slight downward trend. This part of my analysis extends the article by Scipiotheyounger, who used only one season as the data basis.
In the last diagram, I show the three-point revolution in more detail to illustrate how much the style of play in the NBA has changed in recent years. At the beginning I thought that the three-point revolution might have an influence on my claim, but this was not confirmed. It is interesting to note that free throws and two-point attempts have not decreased, while the volume of three-point attempts has increased.
One limitation of this data set is that it is unclear how external factors influence the number of attempted threes. For example, a team may change its defense during a game, or good three-point shooters may change teams. Do three-point shooters play worse in away arenas than in their own? I can’t answer any of these questions based on this data set. I would need a data set that contains detailed player information.
In this analysis, I focus on the data and on whether or not the 3PA variable has an impact on winning. However, I think it is important to emphasise that basketball is played by humans, and people make irrational decisions or mistakes in pressure situations such as a basketball game. This is normal, but it is difficult to prove with data. Overall, my analysis supports Scipiotheyounger’s finding: the number of attempted three-point shots has practically no effect on winning or losing.
In conclusion, the claim that the team that takes more three-point shots in a game than its opponent wins the game is not supported by the data.